humor dataset
Dark & Stormy: Modeling Humor in the Worst Sentences Ever Written
Govindarajan, Venkata S, Biester, Laura
Textual humor is enormously diverse and computational studies need to account for this range, including intentionally bad humor. In this paper, we curate and analyze a novel corpus of sentences from the Bulwer-Lytton Fiction Contest to better understand "bad" humor in English. Standard humor detection models perform poorly on our corpus, and an analysis of literary devices finds that these sentences combine features common in existing humor datasets (e.g., puns, irony) with metaphor, metafiction and simile. LLMs prompted to synthesize contest-style sentences imitate the form but exaggerate the effect by over-using certain literary devices, and including far more novel adjective-noun bigrams than human writers. Data, code and analysis are available at https://github.com/venkatasg/bulwer-lytton
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > Jordan (0.05)
- North America > United States > New York (0.04)
- (9 more...)
Can Pre-trained Language Models Understand Chinese Humor?
Chen, Yuyan, Li, Zhixu, Liang, Jiaqing, Xiao, Yanghua, Liu, Bang, Chen, Yunwen
Humor understanding is an important and challenging research in natural language processing. As the popularity of pre-trained language models (PLMs), some recent work makes preliminary attempts to adopt PLMs for humor recognition and generation. However, these simple attempts do not substantially answer the question: {\em whether PLMs are capable of humor understanding?} This paper is the first work that systematically investigates the humor understanding ability of PLMs. For this purpose, a comprehensive framework with three evaluation steps and four evaluation tasks is designed. We also construct a comprehensive Chinese humor dataset, which can fully meet all the data requirements of the proposed evaluation framework. Our empirical study on the Chinese humor dataset yields some valuable observations, which are of great guiding value for future optimization of PLMs in humor understanding and generation.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Singapore > Central Region > Singapore (0.05)
- Asia > China > Shanghai > Shanghai (0.05)
- (10 more...)